Exploiting Ontology Structure and Patterns of Annotation to Mine Significant Associations between Pairs of CV Terms [RESEARCH PAPER SUBMISSION]
Authors
Abstract
Significant knowledge is captured through annotations on the life sciences Web. In past research, we developed a methodology based on support and confidence metrics from association rule mining to mine the association bridge (of termlinks) between pairs of controlled vocabulary (CV) terms across two ontologies. That (naive) approach did not exploit two sources of knowledge: the implicit knowledge captured via the hierarchical is-a structure of ontologies, and the patterns of annotation in datasets that may affect the distribution of parent/child or sibling CV terms. In this research, we take this knowledge into account. We aggregate termlinks over the sibling CV terms that are children of a parent CV term and use them as additional evidence to boost the support and confidence scores of the parent CV term's associations. A weight factor (α) reflects the contribution from the child CV terms; its value can be varied to reflect the variance of confidence values among the sibling CV terms of a given parent CV term. We illustrate the benefits of exploiting this knowledge through experimental evaluation.
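The abstract does not give the exact scoring formulas, so the following is only a minimal sketch of the idea under stated assumptions: support and confidence are the standard association rule mining ratios computed over a list of termlinks, and the boost simply adds α-weighted counts from the child CV terms to the parent's counts. The function names, the toy data, and the linear weighting are all hypothetical and are not taken from the paper.

```python
def support_confidence(termlinks, a, b):
    """Plain support and confidence for the association a -> b.

    termlinks: list of (term_from_ontology_1, term_from_ontology_2) pairs.
    """
    total = len(termlinks)
    pair = sum(1 for x, y in termlinks if x == a and y == b)
    left = sum(1 for x, _ in termlinks if x == a)
    support = pair / total if total else 0.0
    confidence = pair / left if left else 0.0
    return support, confidence


def boosted_support_confidence(termlinks, parent, b, children, alpha):
    """Assumed boosting rule: child counts are scaled by alpha and added
    to the parent's counts. This is an illustration, not the paper's formula.
    """
    total = len(termlinks)
    kids = set(children)
    pair_parent = sum(1 for x, y in termlinks if x == parent and y == b)
    pair_kids = sum(1 for x, y in termlinks if x in kids and y == b)
    left_parent = sum(1 for x, _ in termlinks if x == parent)
    left_kids = sum(1 for x, _ in termlinks if x in kids)
    pair = pair_parent + alpha * pair_kids
    left = left_parent + alpha * left_kids
    support = pair / total if total else 0.0
    confidence = pair / left if left else 0.0
    return support, confidence


# Toy termlinks: "p" is a parent CV term with children "c1" and "c2";
# "t" and "u" are CV terms from the second ontology.
TERMLINKS = [
    ("p", "t"), ("p", "u"),
    ("c1", "t"), ("c1", "t"),
    ("c2", "t"), ("c2", "u"),
]

plain = support_confidence(TERMLINKS, "p", "t")
boosted = boosted_support_confidence(TERMLINKS, "p", "t", ["c1", "c2"], alpha=0.5)
print(plain, boosted)
```

With α = 0 the boosted scores reduce to the plain ones, and larger values of α let the children's termlinks contribute more evidence to the parent's association.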
Similar resources
Link Prediction for Annotation Graphs Using Graph Summarization
Annotation graph datasets are a natural representation of scientific knowledge. They are common in the life sciences where genes or proteins are annotated with controlled vocabulary terms (CV terms) from ontologies. The W3C Linking Open Data (LOD) initiative and semantic Web technologies are playing a leading role in making such datasets widely available. Scientists can mine these datasets to d...
Using Annotations from Controlled Vocabularies to Find Meaningful Associations
This paper presents the LSLink (or Life Science Link) methodology that provides users with a set of tools to explore the rich Web of interconnected and annotated objects in multiple repositories, and to identify meaningful associations. Consider a physical link between objects in two repositories, where each of the objects is annotated with controlled vocabulary (CV) terms from two ontologies. ...
A Framework for Discovering Associations from the Annotated Biological Web
During the last decade, biomedical researchers gained access to the entire human genome, reliable high-throughput biotechnologies, and affordable computational resources and network access. In combination, these new tools created a new model for biomedical research that no longer uses computational tools merely to monitor research, but instead exploits these tools to acquire knowledge and make ...
Exploration Using Signatures in Annotation Graph Datasets
The widespread development and adoption of ontologies to capture semantic domain knowledge and the growth of annotation graph datasets has created many opportunities for large scale Linked Data analytics. Ontologies are developed by domain experts to capture knowledge specific to some domain. The biomedical community has taken the lead in these activities. Every model organism database has gene...
A Framework for Discovering Meaningful Associations in the Annotated Life Sciences Web
Title of dissertation: A FRAMEWORK FOR DISCOVERING MEANINGFUL ASSOCIATIONS IN THE ANNOTATED LIFE SCIENCES WEB Woei-Jyh (Adam) Lee, Doctor of Philosophy, 2009 Dissertation directed by: Professor Louiqa Raschid Department of Computer Science During the last decade, life sciences researchers have gained access to the entire human genome, reliable high-throughput biotechnologies, affordable computa...